Neural Gas Clustering for Dissimilarity Data with Continuous Prototypes

Authors

  • Alexander Hasenfuss
  • Barbara Hammer
  • Frank-Michael Schleif
  • Thomas Villmann
Abstract

Prototype-based neural clustering and data mining methods such as the self-organizing map or neural gas are intuitive and powerful machine learning tools for a variety of application areas. However, the classical methods are restricted to data embedded in a real vector space and have only limited applicability to non-Euclidean data such as occur in, for example, biomedical or symbolic fields. Recently, extensions of unsupervised neural prototype-based clustering to dissimilarity data, i.e. data characterized only in terms of a dissimilarity matrix, have been proposed; they substitute the mean with the so-called generalized median. As a result, prototype locations are chosen within the discrete input space, which is a severe limitation, particularly for sparse data sets, since prototype flexibility is restricted. Here we present a generalization of median neural gas such that prototypes can be interpreted as mixtures of discrete input locations. We derive a batch optimization scheme based on a corresponding cost function.
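The idea of prototypes as mixtures of input locations can be illustrated with a minimal sketch. The abstract does not give the update equations, so the code below is an assumption, not the authors' scheme: it uses the standard relational distance formula d(x_j, w_i) = [D a_i]_j - 1/2 a_i^T D a_i, which holds when D contains squared Euclidean dissimilarities and the coefficients a_i sum to one, together with a rank-based batch neural gas update. The function names, the annealing schedule, and the toy data are all hypothetical.

```python
import math
import random


def prototype_distances(D, alpha):
    """Distances between mixture prototypes and all data points.

    D is a symmetric n x n matrix of squared dissimilarities (assumption).
    alpha[i] holds the mixture coefficients of prototype i (sums to 1).
    """
    n = len(D)
    out = []
    for a in alpha:
        Da = [sum(D[j][l] * a[l] for l in range(n)) for j in range(n)]
        self_term = 0.5 * sum(a[j] * Da[j] for j in range(n))
        out.append([Da[j] - self_term for j in range(n)])
    return out


def relational_neural_gas(D, n_prototypes, n_epochs=50, seed=0):
    """Batch neural gas on a dissimilarity matrix (illustrative sketch)."""
    n = len(D)
    rng = random.Random(seed)
    # Start from random convex combinations of the data points.
    alpha = []
    for _ in range(n_prototypes):
        row = [rng.random() for _ in range(n)]
        total = sum(row)
        alpha.append([v / total for v in row])
    lam_start, lam_end = n_prototypes / 2.0, 0.1  # assumed annealing range
    for t in range(n_epochs):
        frac = t / max(1, n_epochs - 1)
        lam = lam_start * (lam_end / lam_start) ** frac
        dist = prototype_distances(D, alpha)
        # Neighborhood weight of each data point for each prototype,
        # based on the rank of the prototype for that point.
        h = [[0.0] * n for _ in range(n_prototypes)]
        for j in range(n):
            order = sorted(range(n_prototypes), key=lambda i: dist[i][j])
            for rank, i in enumerate(order):
                h[i][j] = math.exp(-rank / lam)
        # Batch step: normalized neighborhood weights become the new mixture.
        for i in range(n_prototypes):
            total = sum(h[i])
            if total > 0.0:  # keep the old coefficients if all weights underflow
                alpha[i] = [v / total for v in h[i]]
    return alpha


# Toy data: two well-separated groups on a line, given only through D.
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
D = [[(p - q) ** 2 for q in points] for p in points]
alpha = relational_neural_gas(D, n_prototypes=2)
dist = prototype_distances(D, alpha)
labels = [min(range(2), key=lambda i: dist[i][j]) for j in range(len(points))]
print(labels)
```

Because each prototype is stored only as a coefficient vector over the data points, the sketch never needs vector coordinates; the toy points are used solely to build D. A median variant would instead force each coefficient vector to select a single data point, which is the restriction the abstract describes as limiting.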

Similar resources

Magnification Control in Relational Neural Gas

Prototype-based clustering algorithms such as the Self Organizing Map (SOM) or Neural Gas (NG) offer powerful tools for automated data inspection. The distribution of prototypes, however, does not coincide with the underlying data distribution and magnification control is necessary to obtain information theoretic optimum maps. Recently, several extensions of SOM and NG to general non-vectorial ...

Clustering Algorithm for Incomplete Data Sets with Mixed Numeric and Categorical Attributes

The traditional k-prototypes algorithm is well versed in clustering data with mixed numeric and categorical attributes, while it is limited to complete data. In order to handle incomplete data set with missing values, an improved k-prototypes algorithm is proposed in this paper, which employs a new dissimilarity measure for incomplete data set with mixed numeric and categorical attributes and a...

A supervised growing neural gas algorithm for cluster analysis

In this paper, a prototype-based supervised clustering algorithm is proposed. The proposed algorithm, called the Supervised Growing Neural Gas algorithm (SGNG), incorporates several techniques from some unsupervised GNG algorithms such as the adaptive learning rates and the cluster repulsion mechanisms of the Robust Growing Neural Gas algorithm, and the Type Two Learning Vector Quantization (LV...

Topographic mapping of dissimilarity datasets

A great challenge today, arising in many fields of science, is the proper mapping of datasets to explore their structure and gain information that otherwise would remain concealed due to the high-dimensionality. This task is impossible without appropriate tools helping the experts to understand the data. A promising way to support the experts in their work is the topographic mapping of the data...

Topographic Mapping of Large Dissimilarity Data Sets

Topographic maps such as the self-organizing map (SOM) or neural gas (NG) constitute powerful data mining techniques that allow simultaneously clustering data and inferring their topological structure, such that additional features, for example, browsing, become available. Both methods have been introduced for vectorial data sets; they require a classical feature encoding of information. Often ...

Publication date: 2007